SUMMARY For decoding non-binary low-density parity-check (LDPC) codes, logarithm-domain sum-product (Log-SP) algorithms were proposed to reduce the quantization effects of the SP algorithm used in conjunction with FFT. Since FFT is not applicable in the logarithm domain, the computations required at check nodes in the Log-SP algorithms are computationally intensive. What is worse, check nodes usually have higher degree than variable nodes. As a result, most of the decoding time is spent on check node computations, which creates a bottleneck. In this paper, we propose a Log-SP algorithm in the Fourier domain. With this algorithm, the roles of variable nodes and check nodes are switched. The intensive computations are spread over lower-degree variable nodes, where they can be efficiently calculated in parallel. Furthermore, we develop a fast calculation method for the estimated bits and syndromes in the Fourier domain.
key words: LDPC code, non-binary LDPC codes, belief propagation, Galois field, iterative decoding

1. Introduction 

In 1963, Gallager invented binary low-density parity-check (LDPC) codes [1]. Due to the sparseness of the code representation, LDPC codes are efficiently decoded by sum-product (SP) decoders [2] or Log-SP decoders [3]. The Log-SP algorithm is also known as belief propagation. With density evolution [3], a powerful method invented by Richardson and Urbanke, the messages of Log-SP decoding can be statistically evaluated. Optimized LDPC codes achieve reliable transmission at rates very close to the Shannon limit [4].

Binary LDPC codes are defined by sparse parity-check matrices over $\mathrm{GF}(2)$. Non-binary LDPC codes, on the other hand, are defined by sparse parity-check matrices over $\mathrm{GF}(2^m)$ for $2^m > 2$. Non-binary LDPC codes were invented by Gallager [1], and Davey and MacKay [5] found that non-binary LDPC codes can outperform binary ones. Non-binary LDPC codes have attracted much attention recently due to their decoding performance [6]-[10].

It is known that irregularity of Tanner graphs helps improve the decoding performance of binary LDPC codes [4]. This is not the case for non-binary LDPC codes. The $(j = 2, k)$-regular non-binary LDPC codes over $\mathrm{GF}(2^m)$ are empirically known [11] to be the best performing codes for $2^m > 64$, especially for short code lengths. This means that, when designing non-binary LDPC codes, one does not need to optimize the degree distributions of Tanner graphs, since $(j = 2, k)$-regular non-binary LDPC codes perform best. Therefore, we assume $j = 2$. Furthermore, the sparsity of the $(j = 2, k)$-regular Tanner graph leads to efficient decoding. The coding rate is given by $R = 1 - 2/k$, so $k$ increases as $R$ tends to 1.

In this paper, we deal with decoding algorithms for non-binary LDPC codes [5], which are also applicable to binary LDPC codes. Despite the efficient and parallel implementation of the Log-SP algorithm, the application of LDPC codes in industry has been limited so far. This is because the decoders require large memory devices and computationally intensive non-linear check-node computations. Compared to its binary counterpart, the conventional decoding algorithm for non-binary LDPC codes is computationally complex and requires more memory to store messages.

Direct use of the SP algorithm for non-binary LDPC codes over $\mathrm{GF}(2^m)$ requires $O((k-1)2^{2m})$ additions and multiplications per check node and $O(2^m)$ multiplications per variable node. With FFT and IFFT [12], the check node computation in the SP algorithm is greatly reduced, to $O(km2^m)$ additions and multiplications. However, the SP algorithm with FFT is not robust to quantization effects, since the messages are recursively multiplied together. The quantization effect cannot be observed on ordinary PC-like computers equipped with 32-bit FPUs and large memory devices. However, the use of such high-quantization-level processors and large memory devices prevents the realization of high-throughput decoders.

In order to avoid the quantization effect and the need for large memory devices, and to realize high-throughput decoders, SP algorithms in the logarithm domain [13], [14] have been proposed. Multiplications are replaced with additions by treating the messages in the logarithm domain, which reduces the quantization effect. However, with the logarithm-domain SP algorithm, we have to give up the efficient calculation of check node computations, since FFT and IFFT cannot be applied to messages in the logarithm domain. The check node computation requires at most $O((k-1)2^{2m})$ evaluations of $\ln(\cdot)$, $\exp(\cdot)$ and additions, or is approximated by simpler calculations [13], [14], which still require far more computation than variable nodes.

In summary, we face the following problems when using the SP algorithm in the logarithm domain.

• FFT and IFFT are not applicable to the check node computations.

• The check node computation requires much more computation than that of variable nodes. What is worse, check nodes have higher degree $k = 2/(1 - R) > 2$ than variable nodes of degree 2.

The most intensive computations are thus performed at the most crowded nodes, i.e., the nodes of the highest degree. These problems cause a bottleneck at the check node computation. Such a bottleneck cannot be removed even by using fully node-parallel processors, since the node computations are triggered by incoming messages. Most of the decoding time is spent on the check node computations.

In this paper, we propose a decoding algorithm whose messages consistently stay in the Log-Fourier domain. With this algorithm, the computations of variable nodes and check nodes are switched. Check nodes still have higher degree $k$ than variable nodes of degree 2, but the computations at each check node become much lighter. Variable nodes, on the other hand, require much more computation, but their degree is only 2. Consequently, the decoding computations are spread over all the nodes.

Note that our interest is not in reducing the total amount of computation for decoding but in reducing the most intensive computation among all the nodes. Such a reduction is important when the computation for each node is performed in parallel. The proposed algorithm removes the obstacles that prevent the full potential of parallel implementation of LDPC codes from being exploited.

2. Conventional Decoding Algorithms and 
Bottleneck Problem 

For simplicity, we consider a non-binary $(j, k)$-regular LDPC code over $\mathrm{GF}(2^m)$. The extension to irregular LDPC codes is straightforward. Let $N$ be the code length in terms of $\mathrm{GF}(2^m)$ symbols; then the number of parity-check constraints is given as $M = jN/k$. Once a primitive element $\alpha$ of $\mathrm{GF}(2^m)$ is fixed, each symbol is given an $m$-bit representation [15, p. 110]. For example, with a primitive element $\alpha \in \mathrm{GF}(2^3)$ such that $\alpha^3 + \alpha + 1 = 0$, each symbol is represented as $0 = (0,0,0)$, $1 = (1,0,0)$, $\alpha = (0,1,0)$, $\alpha^2 = (0,0,1)$, $\alpha^3 = (1,1,0)$, $\alpha^4 = (0,1,1)$, $\alpha^5 = (1,1,1)$ and $\alpha^6 = (1,0,1)$. We shall use the two representations interchangeably, i.e., $x \in \mathrm{GF}(2^m)$ as a symbol in the Galois field and as an $m$-bit vector.
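The $m$-bit representation above can be generated mechanically by repeated multiplication by $\alpha$. The following is a minimal sketch under stated assumptions: the function name `power_table` and the tuple encoding (coefficient of $1$ first, then $\alpha$, $\alpha^2$, ...) are illustrative choices made here, not notation from the paper.

```python
def power_table(m, poly_low_bits):
    """Return the m-bit vectors of alpha^0 .. alpha^(2^m - 2).

    poly_low_bits holds (pi_0, ..., pi_{m-1}) of the primitive polynomial
    pi(x) = pi_0 + pi_1 x + ... + pi_{m-1} x^{m-1} + x^m, so that
    alpha^m = pi_0 + pi_1 alpha + ... + pi_{m-1} alpha^{m-1} over GF(2).
    """
    vec = [1] + [0] * (m - 1)          # alpha^0 = 1
    table = []
    for _ in range(2 ** m - 1):
        table.append(tuple(vec))
        carry = vec[-1]                # coefficient of alpha^{m-1}
        vec = [0] + vec[:-1]           # multiply by alpha (shift)
        if carry:                      # reduce alpha^m using pi(x)
            vec = [a ^ b for a, b in zip(vec, poly_low_bits)]
    return table

# alpha^3 + alpha + 1 = 0  ->  (pi_0, pi_1, pi_2) = (1, 1, 0)
GF8 = power_table(3, (1, 1, 0))
for i, bits in enumerate(GF8):
    print(f"alpha^{i} = {bits}")
```

With the polynomial of the example, the seven printed vectors should coincide with the list given in the text for $1, \alpha, \ldots, \alpha^6$.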

In the binary representation, the codewords can be viewed as binary sequences of length $mN$. Given an $M \times N$ parity-check matrix $H = \{h_{cv}\}$ over $\mathrm{GF}(2^m)$ of column weight $j$ and row weight $k$, the non-binary LDPC code defined by $H$ is given as

$$\{x \in \mathrm{GF}(2^m)^N \mid Hx = 0 \in \mathrm{GF}(2^m)^M\}.$$

The $c$-th row of the parity-check matrix represents a parity-check equation

$$h_{cv_1} x_{v_1} + \cdots + h_{cv_k} x_{v_k} = 0,$$

where $h_{cv_i}, x_{v_i} \in \mathrm{GF}(2^m)$ for $i = 1, \ldots, k$.

The Tanner graph of the non-binary LDPC code is a bipartite graph with $N$ variable nodes and $M$ check nodes. The $v$-th variable node and the $c$-th check node are adjacent to each other iff $h_{cv} \neq 0$. For simplicity, we denote the $v$-th variable node and the $c$-th check node by $v$ and $c$, respectively. Define $V_c$ as the set of variable nodes adjacent to a check node $c$. Similarly, define $C_v$ as the set of check nodes adjacent to a variable node $v$. For $(j, k)$-regular LDPC codes, we have $|V_c| = k$ and $|C_v| = j$.

The decoding algorithms for binary and non-binary LDPC codes are usually viewed as message-passing algorithms on Tanner graphs. All the algorithms dealt with in this paper involve the following 4 steps.

1. INITIALIZATION : For each variable node $v$ ($v = 1, \ldots, N$), the initial message is calculated from the channel output for the $v$-th transmitted symbol. The variable node $v$ sends the initial message to the adjacent check node $c$ for $c \in C_v$. The iteration round $\ell$ is set as $\ell := 0$.

2. CHECK TO VARIABLE : Each check node $c$ ($c = 1, \ldots, M$) has $k$ incoming messages sent from its $k$ adjacent variable nodes. The check node $c$ computes the outgoing messages to be sent to the $k$ adjacent variable nodes. Increment the iteration round as $\ell := \ell + 1$.

3. VARIABLE TO CHECK : Each variable node $v$ has $j$ incoming messages sent from its $j$ adjacent check nodes. With the initial message, the variable node $v$ computes the outgoing messages to be sent to the $j$ adjacent check nodes.

4. TENTATIVE DECISION : For each variable node $v$, a tentatively estimated symbol $\hat{x}_v^{(\ell)} \in \mathrm{GF}(2^m)$ is calculated from the messages sent from its $j$ adjacent check nodes and the initial message. If the tentatively estimated symbols $(\hat{x}_1^{(\ell)}, \ldots, \hat{x}_N^{(\ell)})$ form a codeword, the decoder outputs the codeword; otherwise it repeats steps 2, 3 and 4. If the iteration round $\ell$ reaches a pre-determined number, the decoder outputs FAIL.

The subject of this paper is decoding algorithms for non-binary LDPC codes. To emphasize the difficulty of the problem, we first review the conventional decoding algorithms for non-binary LDPC codes.

2.1 Conventional Sum-Product Algorithm for Non- 
Binary LDPC Code 

In this section, we review the conventional decoding algorithm [16], [17] for non-binary LDPC codes over $\mathrm{GF}(2^m)$, i.e., the SP algorithm. In the SP algorithm, the messages are represented as probability vectors over $\mathrm{GF}(2^m)$, i.e., each message is a vector in $[0,1]^{2^m}$. The algorithm performs symbol-wise maximum a posteriori probability (symbol-MAP) decoding if the Tanner graph is a tree. Even if the Tanner graph is not a tree, the algorithm approximates symbol-MAP decoding due to the sparseness of the graph. The following describes the conventional SP algorithm [16], [17].

INITIALIZATION : For each variable node $v$, the initial message is given as follows.

$$p_v^{(0)}(x) = \Pr(X_v = x \mid Y_v = y_v), \quad \text{for } x \in \mathrm{GF}(2^m),$$

where $X_v$ is the random variable of the $v$-th transmitted symbol, $y_v$ is the channel output of the $v$-th transmitted symbol and $Y_v$ is its random variable. The variable node $v$ sends the message $p_{vc}^{(0)} = p_v^{(0)} \in [0,1]^{2^m}$ to $c$ for $c \in C_v$. Set $\ell := 0$.

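CHECK TO VARIABLE : Each check node $c$ has $k$ incoming messages $p_{vc}^{(\ell)} \in [0,1]^{2^m}$ ($v \in V_c$) sent from its $k$ adjacent variable nodes. The check node $c$ sends the message $p_{cv}^{(\ell+1)} \in [0,1]^{2^m}$ to $v$ for $v \in V_c$.

$$\tilde{p}_{vc}^{(\ell)}(x) = p_{vc}^{(\ell)}(h_{cv}^{-1} x), \quad \text{for } x \in \mathrm{GF}(2^m),$$

$$\tilde{p}_{cv}^{(\ell+1)} = \bigotimes_{v' \in V_c \setminus \{v\}} \tilde{p}_{v'c}^{(\ell)}, \tag{1}$$

$$p_{cv}^{(\ell+1)}(x) = \tilde{p}_{cv}^{(\ell+1)}(h_{cv} x), \quad \text{for } x \in \mathrm{GF}(2^m),$$

where $p_1 \otimes p_2 \in [0,1]^{2^m}$ is the convolution of $p_1 \in [0,1]^{2^m}$ and $p_2 \in [0,1]^{2^m}$. To be precise, for $x \in \mathrm{GF}(2^m)$,

$$(p_1 \otimes p_2)(x) = \sum_{x_1, x_2 \in \mathrm{GF}(2^m) :\, x = x_1 + x_2} p_1(x_1)\, p_2(x_2).$$

We denote $p_1 \otimes \cdots \otimes p_k$ by $\bigotimes_{i=1}^{k} p_i$. Increment the iteration round as $\ell := \ell + 1$.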
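The convolution defined above can be evaluated directly from its definition. The sketch below is illustrative only: it assumes that each symbol is indexed by the integer whose bits are its $m$-bit vector, so that addition in $\mathrm{GF}(2^m)$ becomes a bitwise XOR of indices; the name `gf_convolve` and the toy vectors are not from the paper. The double loop makes the $2^{2m}$ cost per pairwise convolution visible.

```python
def gf_convolve(p1, p2):
    """Convolution over GF(2^m): out[x] = sum of p1[x1] * p2[x2] with x1 + x2 = x.

    Symbols are indexed by the integer whose bits are the m-bit vector,
    so addition in GF(2^m) is bitwise XOR of the indices.
    """
    n = len(p1)
    out = [0.0] * n
    for x1 in range(n):
        for x2 in range(n):            # n * n = 2^(2m) multiply-adds
            out[x1 ^ x2] += p1[x1] * p2[x2]
    return out

p1 = [0.7, 0.1, 0.1, 0.1]   # toy probability vectors over GF(4)
p2 = [0.4, 0.3, 0.2, 0.1]
q = gf_convolve(p1, p2)
print(q)
```

The permutations by $h_{cv}^{-1}$ and $h_{cv}$ in the check-to-variable step are omitted here; they only reorder the entries of the vectors before and after the convolution.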

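VARIABLE TO CHECK : Each variable node $v$ has $j$ incoming messages $p_{cv}^{(\ell)} \in [0,1]^{2^m}$ ($c \in C_v$) sent from its $j$ adjacent check nodes. The variable node $v$ sends the message $p_{vc}^{(\ell)}$ to $c$ for $c \in C_v$.

$$p_{vc}^{(\ell)}(x) = p_v^{(0)}(x) \prod_{c' \in C_v \setminus \{c\}} p_{c'v}^{(\ell)}(x), \quad \text{for } x \in \mathrm{GF}(2^m).$$

Then normalize $p_{vc}^{(\ell)}$ so that $\sum_{x \in \mathrm{GF}(2^m)} p_{vc}^{(\ell)}(x) = 1$ as follows.

$$p_{vc}^{(\ell)}(x) := p_{vc}^{(\ell)}(x) \Big/ \sum_{x' \in \mathrm{GF}(2^m)} p_{vc}^{(\ell)}(x'), \quad \text{for } x \in \mathrm{GF}(2^m).$$

The decoding output does not change even if this normalization step is replaced with

$$p_{vc}^{(\ell)}(x) := p_{vc}^{(\ell)}(x) / p_{vc}^{(\ell)}(0), \quad \text{for } x \in \mathrm{GF}(2^m).$$

In this case, the messages are no longer probability vectors.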

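TENTATIVE DECISION : The tentatively estimated symbol $\hat{x}_v^{(\ell)} \in \mathrm{GF}(2^m)$ for the $v$-th transmitted symbol is given as

$$\hat{x}_v^{(\ell)} = \mathop{\mathrm{argmax}}_{x \in \mathrm{GF}(2^m)} q_v^{(\ell)}(x),$$

$$q_v^{(\ell)}(x) = p_v^{(0)}(x) \prod_{c' \in C_v} p_{c'v}^{(\ell)}(x), \quad \text{for } x \in \mathrm{GF}(2^m).$$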

The calculation in Eq. (1) is the most complex part of the decoding. However, the convolution is efficiently calculated via Fourier transforms [14]. For example, the $k$-fold convolution

$$q = \bigotimes_{i=1}^{k} p_i \in [0,1]^{2^m}$$

is efficiently calculated via the Fourier transform, for $i = 1, \ldots, k$,

$$P_i(z) := \sum_{x \in \mathrm{GF}(2^m)} p_i(x)(-1)^{z \cdot x}, \quad \text{for } z \in \mathrm{GF}(2^m),$$

component-wise multiplications,

$$Q(z) := \prod_{i=1}^{k} P_i(z), \quad \text{for } z \in \mathrm{GF}(2^m),$$

and the inverse Fourier transform

$$q(x) = \frac{1}{2^m} \sum_{z \in \mathrm{GF}(2^m)} Q(z)(-1)^{z \cdot x}, \quad \text{for } x \in \mathrm{GF}(2^m),$$

where $z \cdot x$ is the dot product of the binary representations of $z$ and $x$. For example, for $z = (1,1,0,1,1,1,0,0)$ and $x = (1,0,0,1,1,0,1,1)$, $z \cdot x = 1 + 0 + 0 + 1 + 1 + 0 + 0 + 0 = 3$. The Fourier transform and inverse Fourier transform are efficiently calculated by FFT and IFFT [12], [14].
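The transform, component-wise product and inverse transform described above can be sketched as follows. This is an illustration under assumptions rather than the implementation of [12] or [14]: the transform with kernel $(-1)^{z \cdot x}$ is computed here with the standard Walsh-Hadamard butterfly, symbols are indexed by the integer form of their $m$-bit vectors, and the names `wht` and `convolve_via_wht` are hypothetical.

```python
def wht(p):
    """Walsh-Hadamard transform: P[z] = sum_x p[x] * (-1)^(z . x).

    Indices are integers whose bits are the m-bit vectors, so z . x is the
    number of common 1 bits. The butterfly uses m * 2^m additions/subtractions.
    """
    a = list(p)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    return a

def convolve_via_wht(vectors):
    """k-fold convolution: transform each vector, multiply component-wise, invert."""
    n = len(vectors[0])
    Q = [1.0] * n
    for p in vectors:
        Q = [q * t for q, t in zip(Q, wht(p))]
    return [v / n for v in wht(Q)]     # inverse transform = forward transform / 2^m

p1 = [0.7, 0.1, 0.1, 0.1]   # toy probability vectors over GF(4)
p2 = [0.4, 0.3, 0.2, 0.1]
q = convolve_via_wht([p1, p2])
print(q)
```

For the toy vectors the result should agree, up to rounding, with the direct evaluation of $(p_1 \otimes p_2)(x)$ from the definition.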

By using FFT, the SP algorithm can be viewed as an iteration of FFTs and component-wise multiplications. Compared to additions, multiplications require more complex computation devices and higher quantization levels. Hence, it is strongly desirable to avoid multiplications in decoders, in order to meet the demand for high-speed, low-quantization-level decoders.



2.2 Logarithm-Domain Sum-Product Decoding for 
Non-Binary LDPC Codes 

In the SP algorithm, many multiplications are needed. By transforming the messages to the logarithm domain, the multiplications can be replaced with additions. The following describes the logarithm-domain SP algorithm, which is referred to as the Log-SP algorithm [13].

INITIALIZATION : For each variable node $v$, the initial message is given as follows.

$$\lambda_v^{(0)}(x) = \ln\left(\Pr(X_v = x \mid Y_v = y_v)\right), \quad \text{for } x \in \mathrm{GF}(2^m).$$

Each variable node $v$ sends the message $\lambda_{vc}^{(0)} = \lambda_v^{(0)} \in [-\infty, 0]^{2^m}$ to $c$ for $c \in C_v$. Set $\ell := 0$.

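CHECK TO VARIABLE : Each check node $c$ has $k$ incoming messages $\lambda_{vc}^{(\ell)} \in [-\infty, 0]^{2^m}$ ($v \in V_c$) sent from its $k$ adjacent variable nodes. The check node $c$ sends the message $\lambda_{cv}^{(\ell+1)} \in [-\infty, 0]^{2^m}$ to $v$ for $v \in V_c$.

$$\tilde{\lambda}_{vc}^{(\ell)}(x) = \lambda_{vc}^{(\ell)}(h_{cv}^{-1} x), \quad \text{for } x \in \mathrm{GF}(2^m),$$

$$\tilde{\lambda}_{cv}^{(\ell+1)} = \mathop{\boxplus}_{v' \in V_c \setminus \{v\}} \tilde{\lambda}_{v'c}^{(\ell)}, \tag{2}$$

$$\lambda_{cv}^{(\ell+1)}(x) = \tilde{\lambda}_{cv}^{(\ell+1)}(h_{cv} x), \quad \text{for } x \in \mathrm{GF}(2^m),$$

where $\lambda_1 \boxplus \lambda_2 \in [-\infty, 0]^{2^m}$ is defined as follows.

$$(\lambda_1 \boxplus \lambda_2)(x) = \ln\Bigl(\sum_{x_1, x_2 \in \mathrm{GF}(2^m) :\, x = x_1 + x_2} e^{\lambda_1(x_1) + \lambda_2(x_2)}\Bigr)$$

for $x \in \mathrm{GF}(2^m)$. We denote $\lambda_1 \boxplus \cdots \boxplus \lambda_k$ by $\mathop{\boxplus}_{i=1}^{k} \lambda_i$. Increment the iteration round as $\ell := \ell + 1$.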
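The operator defined above, the logarithm of a sum of exponentials over all pairs with $x_1 + x_2 = x$, can be sketched directly. The code below is an illustration under assumptions, not the look-up-table or truncated methods of [13], [14]: symbols are indexed by integers so that field addition is XOR, the name `log_convolve` is hypothetical, and a max-shift is added for numerical stability.

```python
import math

def log_convolve(l1, l2):
    """Log-convolution: out[x] = ln( sum_{x1 + x2 = x} exp(l1[x1] + l2[x2]) ).

    Inputs are log-domain messages (entries in [-inf, 0]); GF(2^m) addition
    is XOR on integer indices. A max-shift keeps exp() from underflowing.
    """
    n = len(l1)
    out = []
    for x in range(n):
        terms = [l1[x1] + l2[x1 ^ x] for x1 in range(n)]
        top = max(terms)
        if top == -math.inf:           # every term has probability zero
            out.append(-math.inf)
        else:
            out.append(top + math.log(sum(math.exp(t - top) for t in terms)))
    return out

p1 = [0.7, 0.1, 0.1, 0.1]   # toy probability vectors over GF(4)
p2 = [0.4, 0.3, 0.2, 0.1]
lam = log_convolve([math.log(v) for v in p1], [math.log(v) for v in p2])
print([math.exp(v) for v in lam])
```

Each output entry needs $2^m$ additions and exponentials plus one logarithm, so one pairwise log-convolution costs on the order of $2^{2m}$ operations, which is the cost the text attributes to check nodes.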

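VARIABLE TO CHECK : Each variable node $v$ has $j$ incoming messages $\lambda_{cv}^{(\ell)} \in [-\infty, 0]^{2^m}$ ($c \in C_v$) sent from its $j$ adjacent check nodes. The variable node $v$ sends the message $\lambda_{vc}^{(\ell)}$ to $c$ for $c \in C_v$.

$$\lambda_{vc}^{(\ell)}(x) = \lambda_v^{(0)}(x) + \sum_{c' \in C_v \setminus \{c\}} \lambda_{c'v}^{(\ell)}(x), \quad \text{for } x \in \mathrm{GF}(2^m).$$

Then normalize $\lambda_{vc}^{(\ell)}$ so that $\lambda_{vc}^{(\ell)}(0) = 0$ as follows.

$$\lambda_{vc}^{(\ell)}(x) := \lambda_{vc}^{(\ell)}(x) - \lambda_{vc}^{(\ell)}(0), \quad \text{for } x \in \mathrm{GF}(2^m).$$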

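TENTATIVE DECISION : The tentatively estimated symbol $\hat{x}_v^{(\ell)} \in \mathrm{GF}(2^m)$ for the $v$-th transmitted symbol is given as

$$\hat{x}_v^{(\ell)} = \mathop{\mathrm{argmax}}_{x \in \mathrm{GF}(2^m)} \mu_v^{(\ell)}(x),$$

$$\mu_v^{(\ell)}(x) := \lambda_v^{(0)}(x) + \sum_{c' \in C_v} \lambda_{c'v}^{(\ell)}(x), \quad \text{for } x \in \mathrm{GF}(2^m).$$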



It can be easily seen that the outputs of this Log-SP algorithm are the same as those of the SP algorithm. The check node computation in Eq. (2) is still the most complex part of the decoding. The computation in Eq. (2) can be viewed as a convolution in the logarithm domain. We refer to the operator $\cdot \boxplus \cdot$ as the log-convolution. Such a log-convolution cannot be calculated efficiently by FFT, IFFT and component-wise multiplications, since the messages are transformed into the logarithm domain. However, since the Log-SP algorithm needs additions rather than multiplications, it is more robust to quantization effects when the messages are stored with a small number of bits [13], [16].

For computing Eq. (2), the use of look-up tables is proposed in [13]. Declercq et al. proposed storing only the most contributing messages [14], which gives a good trade-off between decoding complexity and decoding performance.

2.3 Bottleneck Problem 

Due to the demand for high-speed, low-quantization-level decoders, we cannot afford multiplications, which are computationally expensive. Consequently, one needs to choose the Log-SP algorithm rather than the SP algorithm.

There are two reasons why the check node computations, rather than the variable node computations, are the bottleneck of the Log-SP algorithm. The first is evident from the description so far: the computations at variable nodes are simple component-wise additions of message vectors, while the computations at check nodes need non-linear calculations as in Eq. (2).

The second reason is that the number of incoming messages sent into check nodes is generally higher than that of variable nodes. For $(j, k)$-regular non-binary LDPC codes, variable and check nodes have $j$ and $k$ incoming message vectors, respectively. The coding rate $R$ is given as $R = (k - j)/k$. Therefore, the number of incoming messages to a check node is $k/j = 1/(1 - R)$ times as high as that of a variable node. The ratio $k/j$ grows as $R \to 1$.
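As a quick numerical illustration of the relation $k/j = 1/(1 - R)$ with $j = 2$ (the rate values below are chosen here only as examples and are not taken from the paper):

```latex
\begin{align*}
R = 1/2  &: \quad k = \frac{j}{1-R} = 4,  \quad k/j = 2, \\
R = 3/4  &: \quad k = \frac{j}{1-R} = 8,  \quad k/j = 4, \\
R = 9/10 &: \quad k = \frac{j}{1-R} = 20, \quad k/j = 10.
\end{align*}
```

At rate $9/10$ each check node therefore receives ten times as many message vectors as each variable node.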

Due to the above two asymmetries between the computations at variable nodes and check nodes, i.e., the number of incoming messages and the difference in computation functions, we face a bottleneck problem at the check node computations. One may think that, in general, bottleneck problems can be solved by using parallel processors to allocate computation resources intensively to the bottleneck computations. However, since the variable and check node computations are triggered by incoming messages, the bottleneck problem of check node computations cannot be solved even when fully node-parallel processing is possible.

When each node computation is processed in parallel, the total decoding time depends on the most complex node computation among all the nodes. In this paper, we propose a decoding algorithm for non-binary LDPC codes that reduces the largest node-computation amount among all the nodes. Note that our interest is not in reducing the total amount of computation for decoding.

3. New Fourier and Log-Fourier Sum-Product 
Algorithms for Non-Binary LDPC Codes 

In order to reduce the computation amount per check node, which is the bottleneck in the Log-SP algorithm, we propose a decoding algorithm in which the roles of variable nodes and check nodes are switched by initializing the messages with the Fourier transform. To be precise, log-convolutions are performed at variable nodes and component-wise additions are performed at check nodes.

As a preparatory algorithm for the Log-Fourier SP algorithm that will be introduced in Section 3.2, we first introduce, in Section 3.1, the SP algorithm in the Fourier domain, which is referred to as the Fourier SP algorithm. The messages in the Fourier SP algorithm are Fourier transformed at the beginning and inverse Fourier transformed at the end. The computations at variable nodes and check nodes, i.e., component-wise multiplications and convolutions, are switched in the Fourier domain. To be precise, with the proposed algorithm, messages are convolved at variable nodes and multiplied component-wise at check nodes. The Fourier SP algorithm is designed so that it outputs the same decoding results as the SP and Log-SP algorithms do. Thus, the computationally intensive tasks, i.e., convolutions, are assigned to the variable nodes, which have fewer incoming messages than check nodes. On the other hand, the computationally less intensive tasks, i.e., component-wise multiplications, are assigned to check nodes, which have more incoming messages than variable nodes.

3.1 New Fourier Sum-Product Algorithm for Non- 
Binary LDPC Code 

The following describes the Fourier SP algorithm. Note again that this is a preparatory algorithm to help understand the algorithm in Section 3.2.

INITIALIZATION : For each variable node $v$, the initial Fourier-transformed message $P_v^{(0)} \in [-1, 1]^{2^m}$ is given as follows.

$$p_v^{(0)}(x) = \Pr(X_v = x \mid Y_v = y_v), \quad x \in \mathrm{GF}(2^m),$$

$$P_v^{(0)}(z) = \sum_{x \in \mathrm{GF}(2^m)} p_v^{(0)}(x)(-1)^{z \cdot x}$$

for $z \in \mathrm{GF}(2^m)$. This can be done via FFT. Each variable node $v$ sends the message $P_{vc}^{(0)} = P_v^{(0)} \in [-1, 1]^{2^m}$ to $c$ for $c \in C_v$. Set $\ell := 0$.



CHECK TO VARIABLE : Each check node $c$ has $k$ incoming messages $P_{vc}^{(\ell)} \in [-1, 1]^{2^m}$ ($v \in V_c$) sent from its $k$ adjacent variable nodes. The check node $c$ sends the message $P_{cv}^{(\ell+1)} \in [-1, 1]^{2^m}$ to $v$ for $v \in V_c$.

$$\tilde{P}_{vc}^{(\ell)}(z) = P_{vc}^{(\ell)}(H_{cv} z), \quad \text{for } z \in \mathrm{GF}(2^m),$$

$$\tilde{P}_{cv}^{(\ell+1)}(z) = \prod_{v' \in V_c \setminus \{v\}} \tilde{P}_{v'c}^{(\ell)}(z), \quad \text{for } z \in \mathrm{GF}(2^m),$$

$$P_{cv}^{(\ell+1)}(z) = \tilde{P}_{cv}^{(\ell+1)}(H_{cv}^{-1} z), \quad \text{for } z \in \mathrm{GF}(2^m).$$

In Appendix A, we give the definition of $H_{cv}$ and explain why this step is equivalent to the check-to-variable step of the SP algorithm. Increment the iteration round as $\ell := \ell + 1$.

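VARIABLE TO CHECK : Each variable node $v$ has $j$ incoming messages $P_{cv}^{(\ell)} \in [-1, 1]^{2^m}$ ($c \in C_v$) sent from its $j$ adjacent check nodes. The variable node $v$ sends the message $P_{vc}^{(\ell)} \in [-1, 1]^{2^m}$ to $c$ for $c \in C_v$.

$$P_{vc}^{(\ell)} = P_v^{(0)} \otimes \bigotimes_{c' \in C_v \setminus \{c\}} P_{c'v}^{(\ell)},$$

where $P_1 \otimes P_2 \in [-1, 1]^{2^m}$ is the convolution of $P_1 \in [-1, 1]^{2^m}$ and $P_2 \in [-1, 1]^{2^m}$. To be precise, for $z \in \mathrm{GF}(2^m)$,

$$(P_1 \otimes P_2)(z) = \sum_{z_1, z_2 \in \mathrm{GF}(2^m) :\, z = z_1 + z_2} P_1(z_1)\, P_2(z_2).$$

Then normalize $P_{vc}^{(\ell)} \in [-1, 1]^{2^m}$ so that $P_{vc}^{(\ell)}(0) = 1$ as follows.

$$P_{vc}^{(\ell)}(z) := P_{vc}^{(\ell)}(z) / P_{vc}^{(\ell)}(0), \quad \text{for } z \in \mathrm{GF}(2^m).$$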

TENTATIVE DECISION : The tentatively estimated symbol $\hat{x}_v^{(\ell)}$ for the $v$-th transmitted symbol is given as

$$\hat{x}_v^{(\ell)} := \mathop{\mathrm{argmax}}_{x \in \mathrm{GF}(2^m)} q_v^{(\ell)}(x), \tag{3}$$

where for $x, z \in \mathrm{GF}(2^m)$,

$$Q_v^{(\ell)} = P_v^{(0)} \otimes \bigotimes_{c' \in C_v} P_{c'v}^{(\ell)},$$

$$q_v^{(\ell)}(x) = \sum_{z \in \mathrm{GF}(2^m)} Q_v^{(\ell)}(z)(-1)^{z \cdot x}. \tag{4}$$

It is cumbersome that we have to apply the inverse Fourier transform to $Q_v^{(\ell)}$ as in Eq. (4) to decide the tentatively estimated symbols in Eq. (3). We give an alternative way of determining the estimated symbols, which does not need the inverse Fourier transform.

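It is known [18] that Eq. (3) gives the MAP symbol for the $v$-th transmitted symbol when the Tanner graph is a tree. When the Tanner graph is not a tree, an approximate MAP symbol is obtained due to the sparseness of the Tanner graph. The symbol-MAP decoder minimizes the symbol error rate (SER) while the bit-MAP decoder minimizes the bit error rate (BER). For digital communications, it is generally more desirable to lower the BER than the SER. For $x = (x_1, \ldots, x_m) \in \mathrm{GF}(2^m)$, by marginalizing

$$q_{v,i}^{(\ell)}(x_i) := \sum_{x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_m \in \mathrm{GF}(2)} q_v^{(\ell)}(x),$$

the approximate MAP bit $\hat{x}_{v,i}^{(\ell)} \in \mathrm{GF}(2)$ for the $i$-th bit in the $v$-th transmitted symbol is obtained by

$$\hat{x}_{v,i}^{(\ell)} := \mathop{\mathrm{argmax}}_{x_i \in \mathrm{GF}(2)} q_{v,i}^{(\ell)}(x_i).$$

Let $\alpha$ be the fixed primitive element of $\mathrm{GF}(2^m)$. For $i = 1, \ldots, m$, $\alpha^{i-1} \in \mathrm{GF}(2^m)$ is represented as the $m$-bit sequence $(\underbrace{0, \ldots, 0}_{i-1}, 1, \underbrace{0, \ldots, 0}_{m-i})$. It follows that, up to the positive scaling constant of the inverse transform in Eq. (4), which does not affect the sign,

$$Q_v^{(\ell)}(\alpha^{i-1}) = \sum_{x \in \mathrm{GF}(2^m)} q_v^{(\ell)}(x)(-1)^{\alpha^{i-1} \cdot x} = \sum_{x \in \mathrm{GF}(2^m)} q_v^{(\ell)}(x)(-1)^{x_i} = q_{v,i}^{(\ell)}(0) - q_{v,i}^{(\ell)}(1).$$

Thus, without the inverse Fourier transform, directly from $Q_v^{(\ell)} \in [-1, 1]^{2^m}$, we can calculate the approximate MAP bit

$$\hat{x}_{v,i}^{(\ell)} = \begin{cases} 0 & \text{if } Q_v^{(\ell)}(\alpha^{i-1}) > 0, \\ 1 & \text{if } Q_v^{(\ell)}(\alpha^{i-1}) < 0. \end{cases} \tag{5}$$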
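The sign rule in Eq. (5) can be checked numerically on a small example. The following sketch is illustrative: it takes an arbitrary toy a posteriori vector over $\mathrm{GF}(2^3)$ (made up here), computes its transform with a Walsh-Hadamard butterfly, and compares the sign-based bit decisions with the decisions obtained by explicit marginalization. The function names are hypothetical, bits are numbered from 0 in the code, and a tie ($Q = 0$), which Eq. (5) leaves unspecified, is resolved to 1 here.

```python
def wht(p):
    """P[z] = sum_x p[x] * (-1)^(z . x), indices read as m-bit vectors."""
    a = list(p)
    h = 1
    while h < len(a):
        for start in range(0, len(a), 2 * h):
            for i in range(start, start + h):
                a[i], a[i + h] = a[i] + a[i + h], a[i] - a[i + h]
        h *= 2
    return a

def map_bits_from_fourier(Q, m):
    """Bit i (0-based) is 0 when Q at the i-th unit vector is positive, else 1."""
    return [0 if Q[1 << i] > 0 else 1 for i in range(m)]

def map_bits_by_marginalizing(q, m):
    """Reference: compare the two bit marginals q_i(0) and q_i(1) directly."""
    bits = []
    for i in range(m):
        q0 = sum(v for x, v in enumerate(q) if not (x >> i) & 1)
        q1 = sum(v for x, v in enumerate(q) if (x >> i) & 1)
        bits.append(0 if q0 > q1 else 1)
    return bits

# Toy a posteriori vector over GF(8), indexed by the integer form of (x_1, x_2, x_3)
q = [0.05, 0.30, 0.10, 0.25, 0.02, 0.08, 0.15, 0.05]
bits = map_bits_from_fourier(wht(q), 3)
print(bits, map_bits_by_marginalizing(q, 3))
```

In this toy vector the most likely symbol is index 1, i.e. $(1,0,0)$, whereas the bit-wise decisions give $(1,1,0)$, which illustrates that symbol-MAP and bit-MAP decisions need not coincide.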



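In a similar way, we can calculate the syndromes without the inverse Fourier transform. For a given estimated symbol sequence $(\hat{x}_1^{(\ell)}, \ldots, \hat{x}_N^{(\ell)}) \in \mathrm{GF}(2^m)^N$ with $\hat{x}_v^{(\ell)} = (\hat{x}_{v,1}^{(\ell)}, \ldots, \hat{x}_{v,m}^{(\ell)}) \in \mathrm{GF}(2)^m$, the syndrome symbol for a check node $c$ is given by

$$s_c^{(\ell)} := \sum_{v \in V_c} h_{cv}\, \hat{x}_v^{(\ell)}.$$

It can be shown similarly that the $i$-th bit $s_{c,i}^{(\ell)}$ of the syndrome symbol $s_c^{(\ell)}$ is given as

$$s_{c,i}^{(\ell)} = \begin{cases} 0 & \text{if } \prod_{v \in V_c} Q_v^{(\ell)}(H_{cv}\alpha^{i-1}) > 0, \\ 1 & \text{if } \prod_{v \in V_c} Q_v^{(\ell)}(H_{cv}\alpha^{i-1}) < 0. \end{cases}$$

The sequence of the estimated symbols $(\hat{x}_1^{(\ell)}, \ldots, \hat{x}_N^{(\ell)}) \in \mathrm{GF}(2^m)^N$ forms a codeword if $s_{c,i}^{(\ell)} = 0$ for all $c = 1, \ldots, M$ and $i = 1, \ldots, m$.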

3.2 New Log-Fourier Sum-Product Algorithm for 
Non-Binary LDPC Codes 

In this section, we propose the Fourier SP algorithm operated in the logarithm domain. The multiplications in the Fourier SP algorithm are replaced with additions in the logarithm domain.

The Fourier SP algorithm requires many multiplications, which cannot be afforded when realizing high-speed, low-quantization-level decoders. In a way analogous to the Log-SP algorithm, we can consider the Fourier SP algorithm in the logarithm domain, which is referred to as the Log-Fourier SP algorithm.

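For the sake of a simple description of the algorithm, and in order to emphasize the analogy with the Fourier SP algorithm, we use a logarithm-like function $\Gamma : [-1, 1] \to \mathrm{GF}(2) \times [-\infty, 0]$ defined as follows.

$$\Gamma(x) := (\mathrm{sgn}_{\mathrm{GF}(2)}(x), \ln(|x|)) \in \mathrm{GF}(2) \times [-\infty, 0],$$

$$\mathrm{sgn}_{\mathrm{GF}(2)}(x) := \begin{cases} 0 \in \mathrm{GF}(2) & (x > 0), \\ \text{0 or 1, chosen randomly} & (x = 0), \\ 1 \in \mathrm{GF}(2) & (x < 0). \end{cases}$$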
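The map defined above and its inverse can be sketched in a few lines. This is a minimal illustration with assumptions: a pair is stored as a Python tuple (sign bit, log-magnitude), the sign bit for $x = 0$ is fixed to 0 instead of being chosen randomly, and the names `gamma`, `gamma_inv` and `gamma_add` are hypothetical. The example checks that a product of real numbers corresponds to XOR-ing the sign bits and adding the logarithms.

```python
import math

def gamma(x):
    """Gamma(x) = (sign bit in GF(2), ln|x|) for x in [-1, 1].

    For x == 0 the sign bit is arbitrary; this sketch fixes it to 0.
    """
    sign = 1 if x < 0 else 0
    mag = math.log(abs(x)) if x != 0 else -math.inf
    return (sign, mag)

def gamma_inv(pair):
    """Inverse map: (s, a) -> (-1)^s * exp(a)."""
    sign, mag = pair
    return (-1.0 if sign else 1.0) * math.exp(mag)

def gamma_add(a, b):
    """Addition in GF(2) x [-inf, 0]: XOR the sign bits, add the logs."""
    return (a[0] ^ b[0], a[1] + b[1])

x, y = -0.5, 0.25
lhs = gamma(x * y)
rhs = gamma_add(gamma(x), gamma(y))
print(lhs, rhs)
```

Storing only one sign bit next to the log-magnitude is what lets the component-wise multiplications of the Fourier SP algorithm become additions.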

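Obviously, for any non-zero real numbers $x, y \in [-1, 1]$, it holds that $\Gamma(xy) = \Gamma(x) + \Gamma(y)$, where the first components are added in $\mathrm{GF}(2)$, and $\Gamma^{-1}(\cdot)$ is well-defined. The following describes the proposed Log-Fourier domain decoding of non-binary LDPC codes.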

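INITIALIZATION : For each variable node $v$, the initial message $\Lambda_v^{(0)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ is given as follows.

$$p_v^{(0)}(x) = \Pr(X_v = x \mid Y_v = y_v), \quad x \in \mathrm{GF}(2^m),$$

$$P_v^{(0)}(z) = \sum_{x \in \mathrm{GF}(2^m)} p_v^{(0)}(x)(-1)^{z \cdot x}, \quad z \in \mathrm{GF}(2^m),$$

$$\Lambda_v^{(0)}(z) = \Gamma(P_v^{(0)}(z)), \quad z \in \mathrm{GF}(2^m).$$

Each variable node $v$ sends the message $\Lambda_{vc}^{(0)} = \Lambda_v^{(0)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ to $c$ for $c \in C_v$. Set $\ell := 0$.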

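CHECK TO VARIABLE : Each check node $c$ has $k$ incoming messages $\Lambda_{vc}^{(\ell)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ ($v \in V_c$) sent from its $k$ adjacent variable nodes. The check node $c$ sends the message $\Lambda_{cv}^{(\ell+1)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ to $v$ for $v \in V_c$.

$$\tilde{\Lambda}_{vc}^{(\ell)}(z) = \Lambda_{vc}^{(\ell)}(H_{cv} z),$$

$$\tilde{\Lambda}_{cv}^{(\ell+1)}(z) = \sum_{v' \in V_c \setminus \{v\}} \tilde{\Lambda}_{v'c}^{(\ell)}(z), \quad \text{for } z \in \mathrm{GF}(2^m),$$

$$\Lambda_{cv}^{(\ell+1)}(z) = \tilde{\Lambda}_{cv}^{(\ell+1)}(H_{cv}^{-1} z).$$

Increment the iteration round as $\ell := \ell + 1$.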

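VARIABLE TO CHECK : Each variable node $v$ has $j$ incoming messages $\Lambda_{cv}^{(\ell)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ ($c \in C_v$) sent from its $j$ adjacent check nodes. The variable node $v$ sends the message $\Lambda_{vc}^{(\ell)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ to $c$ for $c \in C_v$.

$$\Lambda_{vc}^{(\ell)} = \Lambda_v^{(0)} \boxtimes \mathop{\boxtimes}_{c' \in C_v \setminus \{c\}} \Lambda_{c'v}^{(\ell)},$$

where $\Lambda_1 \boxtimes \Lambda_2 \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ is defined as follows.

$$(\Lambda_1 \boxtimes \Lambda_2)(x) = \Gamma\Bigl(\sum_{x_1, x_2 \in \mathrm{GF}(2^m) :\, x = x_1 + x_2} \Gamma^{-1}\bigl(\Lambda_1(x_1) + \Lambda_2(x_2)\bigr)\Bigr), \tag{6}$$

for $x \in \mathrm{GF}(2^m)$. The difference between the two operators $\boxplus$ and $\boxtimes$ is only a sign bit, which can be ignored. We also refer to the operator $\cdot \boxtimes \cdot$ as the log-convolution.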
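Eq. (6) can be evaluated directly by mapping each pairwise sum back through $\Gamma^{-1}$, accumulating, and applying $\Gamma$ again. The sketch below is a plain illustration of the definition, not an optimized or quantized implementation: pairs are Python tuples (sign bit, log-magnitude), indices are XOR-ed for field addition, and the helper names `to_gamma` and `signed_log_convolve` as well as the toy inputs are assumptions made here.

```python
import math

def to_gamma(v):
    """Gamma(v) as a (sign bit, ln|v|) pair; the sign bit of 0 is fixed to 0 here."""
    return (1 if v < 0 else 0, math.log(abs(v)) if v != 0 else -math.inf)

def signed_log_convolve(a1, a2):
    """out[z] = Gamma( sum_{z1 + z2 = z} Gamma^{-1}(a1[z1] + a2[z2]) ).

    Pairs add by XOR-ing the sign bits and adding the logarithms;
    GF(2^m) addition of indices is bitwise XOR.
    """
    n = len(a1)
    out = []
    for z in range(n):
        total = 0.0
        for z1 in range(n):
            sign = a1[z1][0] ^ a2[z1 ^ z][0]
            mag = a1[z1][1] + a2[z1 ^ z][1]
            total += (-1.0 if sign else 1.0) * math.exp(mag)
        out.append(to_gamma(total))
    return out

a1 = [to_gamma(v) for v in (1.0, -0.5)]    # toy Fourier-domain messages, m = 1
a2 = [to_gamma(v) for v in (1.0, 0.25)]
out = signed_log_convolve(a1, a2)
print(out)
```

For these inputs the underlying real-valued convolution is $(1 \cdot 1 + (-0.5)(0.25),\ 1 \cdot 0.25 + (-0.5) \cdot 1) = (0.875, -0.25)$, so the second output entry should carry sign bit 1.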

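TENTATIVE DECISION : The tentatively estimated symbol $\hat{x}_v^{(\ell)}$ for the $v$-th symbol is given as

$$\hat{x}_v^{(\ell)} = \mathop{\mathrm{argmax}}_{x \in \mathrm{GF}(2^m)} \mu_v^{(\ell)}(x),$$

$$\mu_v^{(\ell)}(x) := \sum_{z \in \mathrm{GF}(2^m)} \Gamma^{-1}\bigl(M_v^{(\ell)}(z)\bigr)(-1)^{z \cdot x},$$

$$M_v^{(\ell)} := \Lambda_v^{(0)} \boxtimes \mathop{\boxtimes}_{c' \in C_v} \Lambda_{c'v}^{(\ell)}.$$

We calculated the MAP bit for the Fourier SP algorithm in Eq. (5). In a similar way, we can calculate the MAP bit $\hat{x}_{v,i}^{(\ell)}$ for the $i$-th bit in the $v$-th transmitted symbol, without the inverse Fourier transform, directly from $M_v^{(\ell)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ as

$$\hat{x}_{v,i}^{(\ell)} = \text{the first entry of } M_v^{(\ell)}(\alpha^{i-1}).$$

In a similar way, it can be shown that the $i$-th bit $s_{c,i}^{(\ell)} \in \mathrm{GF}(2)$ of the syndrome symbol $s_c^{(\ell)} \in \mathrm{GF}(2^m)$ of a check node $c$ for the estimated MAP bits $\hat{x}_{v,i}^{(\ell)}$ ($v = 1, \ldots, N$, $i = 1, \ldots, m$) is calculated, without the inverse Fourier transform, as

$$s_{c,i}^{(\ell)} = \text{the first entry of } \sum_{v \in V_c} M_v^{(\ell)}(H_{cv}\alpha^{i-1}),$$

for $c = 1, \ldots, M$ and $i = 1, \ldots, m$. The sequence of the estimated symbols $(\hat{x}_1^{(\ell)}, \ldots, \hat{x}_N^{(\ell)})$ forms a codeword if $s_{c,i}^{(\ell)} = 0$ for all $c = 1, \ldots, M$ and $i = 1, \ldots, m$.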

4. Comparison of Computation Amount 

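In this section, we compare the computation amount of the conventional and proposed algorithms.

In the conventional Log-SP algorithm, for each check node $c$, with the $k$ incoming messages $\lambda_{vc}^{(\ell)} \in [-\infty, 0]^{2^m}$ for $v \in V_c$, $c$ needs to compute a $(k-1)$-fold log-convolution

$$\mathop{\boxplus}_{v' \in V_c \setminus \{v\}} \tilde{\lambda}_{v'c}^{(\ell)}$$

for $v \in V_c$, i.e., $k$ times. For each variable node $v$, with the $j$ incoming messages $\lambda_{cv}^{(\ell)} \in [-\infty, 0]^{2^m}$ for $c \in C_v$, $v$ needs to compute a $j$-term component-wise addition

$$\lambda_v^{(0)}(x) + \sum_{c' \in C_v \setminus \{c\}} \lambda_{c'v}^{(\ell)}(x), \quad \text{for } x \in \mathrm{GF}(2^m),$$

for $c \in C_v$, i.e., $j$ times.

In the proposed Log-Fourier SP algorithm, in contrast, for each check node $c$, with the $k$ incoming messages $\Lambda_{vc}^{(\ell)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ for $v \in V_c$, $c$ needs to compute a $(k-1)$-term component-wise addition

$$\sum_{v' \in V_c \setminus \{v\}} \tilde{\Lambda}_{v'c}^{(\ell)}(z), \quad \text{for } z \in \mathrm{GF}(2^m),$$

for $v \in V_c$, i.e., $k$ times. For each variable node $v$, with the $j$ incoming messages $\Lambda_{cv}^{(\ell)} \in (\mathrm{GF}(2) \times [-\infty, 0])^{2^m}$ for $c \in C_v$, $v$ needs to compute a $j$-fold log-convolution

$$\Lambda_v^{(0)} \boxtimes \mathop{\boxtimes}_{c' \in C_v \setminus \{c\}} \Lambda_{c'v}^{(\ell)}$$

for $c \in C_v$, i.e., $j$ times.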

The aim of this paper is to reduce the most complex node computation among all the nodes for node-parallel implementation. The most complex node computation in both the conventional Log-SP and the proposed Log-Fourier SP algorithms is the log-convolution. Indeed, one component-wise addition of vectors of length $2^m$ requires only $2^m$ additions, whereas one log-convolution requires as many as $2^{2m}$ computations of additions, $\ln(\cdot)$ and $\exp(\cdot)$. Consequently, we focus on the number of log-convolutions in both algorithms. We assume a $(j = 2, k)$-regular non-binary LDPC code over $\mathrm{GF}(2^m)$, since it is empirically known that good non-binary LDPC codes have parity-check matrices of column weight 2 [11]. This property is highly favorable for the Log-Fourier SP algorithm, since it needs only two log-convolutions per variable node. Table 1 compares the computation amount per node for $(2, k)$-regular non-binary LDPC codes over $\mathrm{GF}(2^m)$ for the conventional Log-SP and the proposed Log-Fourier SP algorithms. It can be seen that the proposed Log-Fourier SP algorithm needs only a constant number of log-convolutions per node even if $k$ gets larger to increase the coding rate $R = (k - 2)/k$, and thus needs fewer log-convolutions per node.
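The growth with $k$ can be made concrete with a small count. The sketch below rests on an assumption that is not stated in the paper: an $n$-fold log-convolution is evaluated naively as $n - 1$ pairwise log-convolutions, with no sharing of intermediate results between the outputs of a node. The function name is hypothetical.

```python
def pairwise_log_convs_per_node(k, j=2):
    """Pairwise log-convolutions needed at the busiest node of each algorithm.

    Assumes an n-fold log-convolution costs n - 1 pairwise ones and that
    nothing is shared between the outputs of a node.
    """
    log_sp_check = k * (k - 2)        # k outputs, each a (k-1)-fold CONV
    log_fourier_var = j * (j - 1)     # j outputs, each a j-fold CONV
    return log_sp_check, log_fourier_var

for k in (4, 6, 20):
    rate = (k - 2) / k
    print(k, rate, pairwise_log_convs_per_node(k))
```

Under these assumptions the per-node count for Log-SP grows quadratically in $k$, while the count for Log-Fourier SP stays at 2 for $j = 2$ regardless of the rate.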

Due to their intensive computation and the large number of incoming messages, the log-convolutions have been the main obstacle to parallelized implementations of high-speed decoders for LDPC codes. The number of component-wise additions needed per node in the proposed Log-Fourier SP algorithm is larger than that in the conventional Log-SP algorithm. Nevertheless, the computation amount of log-convolutions per node is greatly reduced. With the Log-Fourier SP algorithm, we can realize a node-parallel implementation that does not have the bottleneck problem.

Table 1 Comparison of the computation amount per node for $(2, k)$-regular non-binary LDPC codes over $\mathrm{GF}(2^m)$. ADD stands for the component-wise addition of vectors of length $2^m$. CONV stands for the log-convolution of vectors of length $2^m$, as defined in Eq. (6). Usually, $k > 3$ is used. One ADD requires only $2^m$ additions. On the other hand, one CONV requires as many as $2^{2m}$ computations of additions, $\ln(\cdot)$ and $\exp(\cdot)$.

                  VARIABLE TO CHECK    CHECK TO VARIABLE        TENTATIVE DECISION
Log-SP            2-term ADD x 2       (k - 1)-fold CONV x k    3-term ADD
Log-Fourier SP    2-fold CONV x 2      (k - 1)-term ADD x k     3-fold CONV

When each node computation is processed in parallel, the total decoding time depends on the most complex computation among all the nodes. The proposed Log-Fourier SP algorithm reduces the largest computation amount among all the nodes. To be precise, the $(k-1)$-fold log-convolution performed $k$ times was the most complex node computation in the conventional Log-SP algorithm. The most complex node computation in the Log-Fourier SP algorithm is reduced to a 2-fold log-convolution performed 2 times.

5. Discussions and Conclusions 

In this paper, we proposed a decoding algorithm suitable for fully node-parallel implementation of non-binary LDPC codes. The proposed algorithm reduces the most complex node computation, which results in a large reduction of the total decoding time when each node computation is processed in parallel.

It should be noted that Hartmann and Rudolph (HR) [19] developed a decoding algorithm for the dual code by using the Fourier transform. HR decoding makes MAP decoding possible by decoding the dual code with the Fourier-transformed channel outputs. However, the application of HR decoding to LDPC codes has been limited to decoding the constituent high-rate codes, e.g., single parity-check codes for LDPC codes. Gallager's $f$ function [1, p. 43] can be viewed as Fourier transforming the log-likelihood ratio (LLR) to the Log-Fourier domain, which reduces decoding of a single parity-check code to decoding of a repetition code. Isaka used HR decoding for the constituent Hamming codes of generalized LDPC codes [20]. The dual code of an LDPC code with a factor graph $G$ is given by exchanging the "=" nodes and "+" nodes [21]. The dual of an LDPC code with parity-check matrix $H$ is given by a low-density generator-matrix (LDGM) code with parity-check matrix $(H^T \mid I)$, obtained by puncturing the bits corresponding to $H^T$. The proposed algorithm can be viewed as a slightly modified application of HR decoding, not to the constituent codes, but to the whole LDPC code. The modification is that the decoding algorithm for the dual code of the LDPC code is not MAP decoding but the Log-SP algorithm.

Appendix A: Companion Matrix 

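For the primitive element $\alpha \in \mathrm{GF}(2^m)$, we denote the corresponding primitive polynomial by $\pi(x) = \pi_0 + \pi_1 x + \cdots + \pi_{m-1} x^{m-1} + x^m$, where $\pi_0, \ldots, \pi_{m-1} \in \mathrm{GF}(2)$. The companion matrix of $\alpha$ is given as

$$A = \begin{pmatrix} 0 & 0 & \cdots & 0 & \pi_0 \\ 1 & 0 & \cdots & 0 & \pi_1 \\ 0 & 1 & \cdots & 0 & \pi_2 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & \pi_{m-1} \end{pmatrix}.$$

As shown in [15], under the $m$-bit representation of $\mathrm{GF}(2^m)$ symbols, it is readily checked that

$$A^i \alpha^j = \alpha^{i+j},$$

where $\alpha^j$ and $\alpha^{i+j}$ are interpreted as $m$-bit vectors. For $h_{cv} = \alpha^i$, we define $H_{cv}$ as the $m \times m$ binary matrix $H_{cv} = (A^i)^T$. Then, regarding $x$ as a column vector, we have $h_{cv} x = H_{cv}^T x$ and $h_{cv}^{-1} x = (H_{cv}^{-1})^T x$, so that $z \cdot (h_{cv} x) = (H_{cv} z) \cdot x$. We will show that the check-to-variable step of the Fourier SP algorithm is equivalent to that of the SP algorithm. To this end, it is sufficient to show that the Fourier transform of $\tilde{p}_{vc}^{(\ell)}$ is $\tilde{P}_{vc}^{(\ell)}$:

$$\begin{aligned}
\sum_{x \in \mathrm{GF}(2^m)} \tilde{p}_{vc}^{(\ell)}(x)(-1)^{z \cdot x}
&= \sum_{x \in \mathrm{GF}(2^m)} p_{vc}^{(\ell)}(h_{cv}^{-1} x)(-1)^{z \cdot x} \\
&= \sum_{x \in \mathrm{GF}(2^m)} p_{vc}^{(\ell)}(x)(-1)^{z \cdot (h_{cv} x)} \\
&= \sum_{x \in \mathrm{GF}(2^m)} p_{vc}^{(\ell)}(x)(-1)^{(H_{cv} z) \cdot x} = P_{vc}^{(\ell)}(H_{cv} z) = \tilde{P}_{vc}^{(\ell)}(z).
\end{aligned}$$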
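The two facts used in this appendix, that the companion matrix acts as multiplication by $\alpha$ and that moving it across the dot product transposes it, can be checked numerically for $\mathrm{GF}(2^3)$. The sketch below is illustrative: the helper names are hypothetical, vectors are Python lists with the coefficient of $1$ first, and the matrix layout (ones on the subdiagonal, $\pi_0, \ldots, \pi_{m-1}$ in the last column) is the one assumed in this appendix.

```python
def companion(pi_low):
    """Companion matrix A (rows of 0/1) with ones on the subdiagonal and
    (pi_0, ..., pi_{m-1}) as the last column, so that A x = alpha * x."""
    m = len(pi_low)
    A = [[0] * m for _ in range(m)]
    for r in range(1, m):
        A[r][r - 1] = 1
    for r in range(m):
        A[r][m - 1] = pi_low[r]
    return A

def mat_vec(M, x):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, x)) % 2 for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def dot(z, x):
    """Dot product of two bit vectors, reduced modulo 2."""
    return sum(a & b for a, b in zip(z, x)) % 2

A = companion((1, 1, 0))                 # alpha^3 + alpha + 1 = 0
H = transpose(A)                         # H_cv for h_cv = alpha
alpha_powers = [[1, 0, 0]]
for _ in range(6):
    alpha_powers.append(mat_vec(A, alpha_powers[-1]))
print(alpha_powers)
```

The printed vectors should reproduce the representation of $1, \alpha, \ldots, \alpha^6$ given in Section 2, and for every pair of bit vectors $z, x$ the parity of $z \cdot (Ax)$ equals that of $(Hz) \cdot x$, which is the step used in the derivation.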